Conversation

@rongxin-liu
Contributor

@rongxin-liu rongxin-liu commented Aug 3, 2025

This PR updates the way Python packages are installed in the Dockerfile to improve dependency maintainability.

Currently, it is unclear which package versions are used when building the image. This can sometimes lead to unexpected or breaking changes in production, especially since almost all our web apps are built on the Flask stack.

@rongxin-liu rongxin-liu self-assigned this Aug 3, 2025
@dmalan
Member

dmalan commented Aug 3, 2025

What sort of breakages have we experienced? Generally, we want the latest of our own packages anyway, so this would add a risk that cs50/cli will drift out of date if we forget to update.

@dmalan
Member

dmalan commented Aug 3, 2025

Also, since all other packages are confined to Dockerfile, why not just specify versions with == therein, rather than introduce an external file?

@rongxin-liu
Contributor Author

> Also, since all other packages are confined to Dockerfile, why not just specify versions with == therein, rather than introduce an external file?

It would be easier to run a command that updates all dependencies from requirements.txt or similar text/JSON files, rather than manually checking each dependency’s version and then updating them in the Dockerfile. I’m also thinking of Ruby, NPM, etc. Our Dockerfile isn’t fully self-contained anyway.

That said, it’s fine with me if we want to handle updates manually in the Dockerfile, given that we don’t have many Python dependencies to manage, but it’s not sustainable if this list continues to grow. All our EB deployments already have pinned dependencies and must be updated manually, which I’d actually prefer to do periodically rather than get surprises after EB updates the runtime and a new instance installs the latest dependencies (with breaking changes, e.g., Flask 3.x updates).
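Whichever file ends up holding the versions, a small check can flag entries that are not pinned with ==. The following is a minimal sketch, assuming a requirements.txt-style format; the function name `unpinned_requirements` and the example versions are hypothetical and not taken from this repository.

```python
import re

# Matches "name==version", optionally with extras, e.g. "Flask==3.0.0" or "pkg[extra]==1.2".
# Wildcards such as "==1.*" are deliberately not treated as pinned.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^=<>!~*\s]+$")


def unpinned_requirements(text):
    """Return requirement lines that are not pinned to one exact version."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options like "-r"
            continue
        # Ignore environment markers ("; python_version < '3.9'") when matching.
        spec = line.split(";", 1)[0].strip().replace(" ", "")
        if not PINNED.match(spec):
            unpinned.append(line)
    return unpinned


# Hypothetical file contents; the versions are placeholders, not the image's real ones.
example = """\
Flask==3.0.0
inflect
requests>=2.0  # a range, so still unpinned
"""
print(unpinned_requirements(example))
```

A check like this could run before the image is built, so that an unpinned entry fails early instead of surfacing later as a surprise upgrade.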

@rongxin-liu rongxin-liu closed this Aug 3, 2025
@rongxin-liu
Contributor Author

> What sort of breakages have we experienced? Generally, we want the latest of our own packages anyway, so this would add a risk that cs50/cli will drift out of date if we forget to update.

A recent issue involved an inflect update that caused inflect.engine() to take significantly longer, breaking some check50 checks (due to the default timeout) and affecting certificates.cs50.io. I had to check the release dates of all dependencies and compare them to our Docker image build timestamp to identify the culprit. That doesn't feel like an ideal way to troubleshoot things, since it was all based on my hunch, tbh.
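The manual comparison described above can be sketched roughly as follows. This is an illustration only: it assumes the build time of the last known-good image is also available (the thread mentions only one build timestamp), the function name `releases_between` is hypothetical, and every version and date in the sample data is made up. In practice the upload times might come from, for example, PyPI's JSON API, which is not queried here.

```python
from datetime import datetime, timezone


def releases_between(releases, last_good_build, bad_build):
    """Return (package, version, uploaded) tuples uploaded after the last
    good build and no later than the bad build, newest first."""
    suspects = [
        (package, version, uploaded)
        for package, versions in releases.items()
        for version, uploaded in versions.items()
        if last_good_build < uploaded <= bad_build
    ]
    return sorted(suspects, key=lambda s: s[2], reverse=True)


def utc(*args):
    """Build a timezone-aware UTC datetime."""
    return datetime(*args, tzinfo=timezone.utc)


# Placeholder data: package names come from the thread, but every version
# and date here is made up for illustration.
releases = {
    "inflect": {"1.0.0": utc(2025, 1, 10), "1.1.0": utc(2025, 3, 2)},
    "Flask": {"2.0.0": utc(2024, 11, 5)},
}
suspects = releases_between(releases, utc(2025, 2, 1), utc(2025, 3, 15))
print(suspects)
```

With pinned versions, this search becomes unnecessary, because the diff of the pin file between two builds already lists exactly what changed.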

@rongxin-liu rongxin-liu deleted the refactor-dependencies branch August 3, 2025 19:37
@dmalan
Member

dmalan commented Aug 4, 2025

Gotcha. We would need to start pinning dependencies in, e.g., https://github.com/cs50/check50/blob/main/setup.py as well, then?

